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Abstract — Blind system identification is known to be a 
hard ill-posed problem and without further assumptions, no 
unique solution is at hand. In this contribution, we are 
concerned with the task of identifying an ARX model from 
only output measurements. Driven by the task of identifying 
systems that are turned on and off at unknown times, we 
seek a piecewise constant input and a corresponding ARX 
model which approximates the measured outputs. We phrase 
this as a rank minimization problem and present a relaxed 
convex formulation to approximate its solution. The proposed 
method was developed to model power consumption of electrical 
appliances and is now a part of a bigger energy disaggregation 
framework. Code will be made available online. 

I. Introduction 

Consider an auto-regressive exogenous input (ARX) 
model 

y{t)-aiy(t - 1) a na y(t - n a ) 

=b\u(t — rife — 1) H h b nb u(t - n k - n b ) (1) 

with input u e 3? and output y e 5J. Estimation of 
this type of model is probably the most common task in 
system identification and a very well studied problem, see for 
instance [24]. The common setting is that {{y(t), uit))}^ 
is given and the summed residuals 

N / n b n a \ 2 

t = n y fei = l fe 2 =l J 

where n = max(n a , rik+rib) + 1, is minimized to obtain an 
estimate for ai , . . . , a„ Q ,bi, . . . ,b nb . This estimate is often 
referred to as the least squares (LS) estimate. 

In this paper we study the more complicated problem of 
estimating an ARX model from solely outputs {y(t)}^ =1 . 
This is an ill-posed problem and it is easy to see that under 
no further assumptions, it would be impossible to uniquely 
determine on, . . . , a na , &i, . . . , b nb . 

We will in this contribution study this problem under 
the assumption that the input is piecewise constant. This 
is a rather natural assumption and a problem faced in 
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many identification problems. Consider e.g., the modeling 
of an electrical appliance where the power consumption is 
monitored while the appliance is turned on and off. The exact 
time for when the appliance was turned on and off is not 
known and neither is the amplitude of the "input". 

It should be noticed that the assumption of a piecewise 
constant input is not enough to uniquely determine the input 
or the ARX model. Specifically, we will not be able to decide 
the input or the ARX coefficients b\ , . . . , b nt more than up to 
a multiplicative scalar. However, for many applications this 
is sufficient, as we will illustrate in the numerical section. 

The task of identifying a model from only outputs is in 
system identification referred to as blind system identification 
(BSI). It is known to be a difficult problem and in general 
ill-posed. 

II. Background 

Our work is motivated by blind system identification 
which is a fundamental signal processing tool used for 
identifying a system using only observations of the systems 
output. Formally, given the output signal of a system, BSI 
serves as a tool for estimating the unknown inputs and system 
model [2]. 

w 
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Fig. 1. Input-Output Model of a System 

Consider the block diagram in Figure 1 . Suppose that the 
system we want to identify is linear. Given only y, BSI is 
used to identify the input u and the system transfer function 
H. In this way, BSI is a method for solving the inverse 
problem of system identification without input information. 
Consider now a discrete linear, time-invariant system. Note 
that we describe the theory in this section for discrete time 
systems but the continuous time counterpart can be derived in 
a similar fashion. The output can be written as a convolution 
model, i.e., 

y{t) = u(t) * h(t) + w{t) (2) 

where * is the convolution operator and w is a noise term. 
This problem can be transfered to the frequency domain by 
applying the Fourier transform to get the following system: 

Y(s) = H(s)U(s) + W(s) (3) 

An alternative name for BSI when the system is linear, 
time-invariant is blind deconvolution. For a known input 



u(k), a deconvolution process can be applied to yi{k) to 
approximate hi(k). For instance, using a pseudo-inverse filter 
which is an approximation of the Weiner filter, the result of 
deconvolution gives 



U(s) 



(4) 



\H{s)\ 2 + C 

where (-)t denotes the pseudo-inverse and C is a constant 
chosen based on heuristics and serves to prevent amplifica- 
tion of noise [12]. However, we are interested in solving the 
problem with unknown inputs. When the input is unknown, 
usually partial information about the statistical properties of 
{u(t)} is required in order to obtain a good approximation 
of the output {y(t)}. Further, how the partial information is 
used in the identification problem plays an important role in 
the quality of the solution [23]. 

Typically, system identification requires information on the 
input and the output of a system in order for the problem to 
be well-posed and to reconstruct the system itself. However, 
in many applications, e.g., data communications, speech 
recognition, image restoration and seismic signal processing, 
this information is not readily available. Broadly speaking, in 
all these application areas we can describe the identification 
problem using the following abstract formulation. 

Suppose there is a signal that is transmitted through 
a 'channel' that can be described using a linear, time- 
invariant model with a single input and p outputs. The 
input to the system {u(t)} results in N of output sequences 
{yi(t)},...,{y p (t)}. Let {h!(t)},...,{h p (t)} denote the 
finite impulse responses (FIR's) which are of order K. As we 
noted above, a linear, time-invariant system of this type can 
be described using the convolution model in Equation (2) for 
each i £ {1, . . . ,p}, i.e., 



yi{t) = u(t)*hi(t) + Wi(t). 
This model can be concisely as 

y = Hu + w 

where 



(5) 



(6) 



(7) 



y-=[yj ••• yJ] T 

with each y 4 = [^(1) • • • yi(N)] T , and 

u = [u{-K) u(-K +1) • • • u(N - 1)] T • (8) 
The matrix H takes the form 

H:=[H T ••• HJ] T (9) 
where each EL is an N x (N + K) filtering matrix given by 
\hi{K) ••• hi(0) ••• 



H, 







hi(K) 



Mo) 



(10) 



Now that the system has been written in this form, we can 
formulate the BSI or blind deconvolution problem and ask 
when it has a well-defined solution. If a system identification 
problem is well-posed, then all the unknown parameters can 



be uniquely determined given the data. Given y and w = 0, 
then we can only hope to solve the system in Equation (6) 
for unique u and H up to a scalar [2]. In this case we call 
the system identifiable. Necessary and sufficient conditions 
for identifiability are given in [21] and summarized in [2]. 

There are a number of methods for estimating either the 
input u or the system function H. Once either the input 
or the system matrix has been estimated, the other can 
be calculated using the estimate. The application typically 
determines whether a direct estimation of input or system 
matrix should be done. For instance, in communication ap- 
plications the input carries the information and as such direct 
estimation should be used for the input and the system matrix 
should be calculated after. The input u can be estimated 
using the following methods: input subspace (IS) method, 
mutually referenced equalizers (MRE), or linear prediction 
(LP) method (see [2], [17], [27]). The system matrix can 
be directly estimated using the maximum likelihood (ML) 
method (see, for instance, [32]) and the subspace method 
(see [1]). We remark that in the above formulation we have 
considered only FIR models. These tend to be sufficient in 
practice considering that infinite impulse responses can be 
approximated by FIR's and modeling with FIR's results in 
problem formulations that have tractable solutions. 

In this paper we are concerned specifically with estimating 
an ARX model from only output observations and we 
formulate the problem using the BSI framework. 

III. Problem Formulation 

Given {y(t)}^ =1 € Sft and a bound for the noise e, find an 
estimate for ai , . . . , a„ a , b\ , . . . , b Ub € 3? and an over time 
piecewise constant u(t) € t = 1, . . . , N, such that 

y(t)-aiy(t - 1) a na y(t - n a ) 

=biu(t - n k - 1) -I h b nb u(t -n k - n b ) + w(t), 

for t = n, . . . , N, where n = max(n a , n k + rib) + 1, and 

\w(t)\ < e, t = n,...,N. (11) 

We will for simplicity assume that n a , rib, nk, are known. To 
make the problem well posed, we will seek the piecewise 
constant input with the least amount of changes. Other 
choices have been studied for the related problem of blind 
deconvolution, see [3] for a solution where the signals to be 
recovered are assumed to be in some known subspaces. 

IV. Notation and Assumptions 

We will use y to denote the output and u the input. We 
will for simplicity only consider single input single output 
(SISO) systems. We will assume that N measurements of y 
are available and stack them in the vector y, i.e., 



y=[y(l) ... y(A0] T . 



(12) 



We also introduce u, w, a and b as 



u=[«(l) • 


.. u(N)] T , 


(13) 


w = [w(l) 


•• w(N)] T , 


(14) 


a = [ai . . . 


a «a] T 7 


(15) 


b = [6i ... 




(16) 



We will use y(i) to denote the ith element of y. To pick out 
a subvector of y consisting of the ith to the jth element we 
will use the notation y(i : j) and similarly for picking out a 
subvector of u, a and b. To pick out a submatrix consisting 
of the ith to the jth rows of X we use the notation X(i : j, :). 

We will use normal font to represent scalars and bold for 
vectors and matrices. j| • j|o is the zero norm which returns 
the number of nonzero elements of its argument and || ■ || p the 
p-norm defined as ||y|| p 4 ^JyVW- WMij is used to 
denote the combination of the i-norm with the j-norm. The 
z-norm is applied to each row of X and the j-norm on the 
resulting vector. We will use Am to denote the (N — 1) x 1 
row vector made up of consecutive differences of u's, 

Au =u(l : N - 1) - u(2 : N) 

= [u(l)-«(2) ••• u(N - 1) - u(N)] . 

V. Blind Identification using Lifting 

We can formulate the problem of finding the input that 
changes most infrequently and the ARX coefficients as the 
non-convex combinatorial problem 



mm 

u(t),w{t)t = 1,...,N. 
oi, . . . , a„ , bi, . . . ,b„ b 



l|Au|| 0) 



(17a) 



subj. to y(t) - aiy(t - 1) a„ a y(t - n a ) (17b) 

= bm(t - rik - 1) H + b„ b u(t - n k - n b ) + w(t), (17c) 

\w(t)\<e, t = n,...,N, (17d) 

with the zero-norm counting the number of nonzero el- 
ements of Am. Note that the combinatorial nature of the 
zero-norm alone makes (17) difficult to solve. In addition 

{o*}fc=i. {h}klv and {«W}f=i ^ unknown, 

which makes even small problems (N small) difficult to 
solve. 

Introduce X = ub T e $l Nxnb . If we assume that ||b|| 2 ^ 
0, the objective of (17) can be written as 

||A«||o=||||b|| 2 (u(l:JV-l)-u(2:JV))|| 

=||X(1 : JV - 1, :) - X(2 : JV, :)|| 2> o (18) 
Problem (17) can now be reformulated as 

min ||X(1 : JV - 1, :) - X(2 : JV, :)|| 2>0 (19a) 

X,w,a,b 



subj. to y(t) = y~]X(t-n k - fci, fci) 



fci=i 



+ X] a^vit- k 2 )+w(t), 
fe 2 =i 

\w(t)\ < e, t = n,...,N, 
ranfc(X) = 1. 



(19b) 

(19c) 

(19d) 
(19e) 



This problem is equivalent with (17) in the following sense. 
Assume that (19), has a unique solution X*, then X* must 
satisfy X* = u*(b*) T , with u* and b* solving (17). 
Extracting the rank 1 component of X*, using e.g., singular 
value decomposition, we can hence decide both u* and b* 
up to a multiplicative scalar (note that we can never do better 
with the information at hand, not even if we would be able 
to solve (17)). The estimate of a will be identical for both 
problems. 

The technique of introducing the matrix X to avoid 
products between u and b is well known in optimization 
and referred to as lifting [31], [26], [28], [18]. 

Problem (19) is combinatorial and nonconvex and there- 
fore not easier to solve than (17). To get an optimization 
problem we can solve, we relax the zero norm with the l\- 
norm and remove the rank constraint and instead minimize 
the rank. Since the rank of a matrix is not a convex function, 
we replace the rank with a convex heuristic. Here we choose 
the nuclear norm, but other heuristics are also available (see 
for instance [16]). We then obtain the convex program 

min ||X||. + A||X(l:JV-l,:)-X(2:JV,:)|| 2 ,i (20a) 

X,w,a,b 



subj. to y(t) = ^2 X(t — nk — ki, ki) 



fc 1= i 



+ a k2 y(t- k 2 ) + w(t), 

k 2 =l 

\w{t)\<e, t = n,...,N, 



(20b) 

(20c) 
(20d) 



which we refer to as blind identification via lifting (BIL) 
of ARX models with piecewise constant input. A > is a 
design parameter that roughly decides the tradeoff between 
rank of X and the number of changes in the input. Ideally, 
A is set to some large number and then decreased until the 
solution X to BIL becomes rank one. 

VI. Analysis 

In this section, we highlight some theoretical results 
derived for BIL. The analysis follows that of CS, and is 
inspired by derivations given in [30], [10], [9], [13], [15], 
[8], [4], [7], [11]. 

We need the following generalization of the RIP-property. 

Definition 1 (RIP): We will say that a linear operator A : 

Sft7uxn 2 sjjn 3 is ( £; fc ) _ RIP if 



\\A(Z) 



izo; 



i 



< £ 



(21) 



for all m x n 2 -matrices Z satisfying 

0=||Z(1,:)-Z(2,:)|| 2i0 (22) 
0=||Z(n 1 -l,:)-Z(n 1 ,:)|| 2i0 (23) 
<||Z(1 : m - 1, :) - Z(2 : m, :)|| 2;U < k (24) 

and Z^0. Z(:) is here used to denote the vectorization of 

the matrix Z. 

We can now state the following theorem: 
Theorem 1 (Uniqueness): If Z satisfies b = A(Z), 



0<||Z(l:ni-l,:)-Z(2:n 1 ,:)|| 2>0 <fc 



(25) 



and A is (e, 2k) — RIP with e < 1 then there exist no other 
solutions to b = A(Z) satisfying (22)-(24). 

Proof: Assume the contrary, i.e., that there exist another 
solution Z such that Z ^ Z and that satisfies (22)-(24). It 
is clear that (22) and (23) hold. In addition, 

< ||Z(1 : m - 1,:)-Z(1 : m - 1,:) 

-(Z(2:n 1 ,:)-Z(2:n 1 ,:)))|| 2l o <2fc. (26) 

Hence (21) must hold for Z - Z. But since A(Z) = A{Z) = 
b we get from (21) that 1 < e, which is a contradiction. We 
hence have that Z is unique solution to b = A(Z) satisfying 
(22)-(24). ■ 

The following corollary now follows trivially. 

Corollary 2 (Recoverability): Let Z* be the solution of 

mm ||Z||, + A||Z(1 : m - 1, :) - Z(2 : n u :)|| 2> i 
subj. to b=A{Z). 

(27) 

If A is (e, 2k) - RIP with e < 1, Z* satisfies (22)-(24) and 
rank(Z*) = 1, then Z* is also the solution of 

min ||Z(l:ni-l,:)-Z(2:ni,:)|| 2 ,o 
z (28) 
subj. to b = A(Z), rank(Z) = 1. 
Proof: The corollary follows directly from Theorem 1 . ■ 
It is easy to see that (20) has the same form as (27) and (28) 
as (19). Corollary 2 hence provides necessary conditions for 
when the relaxation, going from (19) to (20), is tight. 

VII. Solution algorithms and software 

Many standard methods of convex optimization can be 
used to solve the problem (20). Systems such as CVX 
[20], [19] or YALMIP [25] can readily handle the nuclear 
norm and the sum-of-norms regularization. For large scale 
problems, the alternating direction method of multipliers 
(ADMM, see e.g., [5], [6]) is an attractive choice and 
we have previously shown that ADMM can be very effi- 
cient on similar problems [30]. Code for solving (20) will 
be made available on http://www.rt.isy.liu.se/ 
-ohlsson/ code . html 

VIII. Numerical Illustrations 
A. A Simple Noise Free FIR Example 

In this example, given {y{t)}fZi and n a = 0,7i{, = 3, 
we illustrate the ability to recover the FIR model used to 
generate {y(t)}^i an d the correct piecewise constant input 
{■(/(i)}^ (up to a multiplicative scalar). The given y is 
shown in Figure 2 and the input that was used to generate y in 
Figure 3. The true b was [-7.4111 -5.0782 -3.2058] . 

To recover {u(t)}f^i and b we use BIL. e was set to and 
A was increased until the first singular value was significantly 
larger than the second singular value. A = 10 4 gave a first 
singular value of 64.3 and a second singular value of 9.8 x 
10~ 6 . The estimated input can for this A not be distinguished 
from the true and the estimate for b is equal to the true b up 
the to the numerical precision of the solver after rescaling. 

On this simple example, a method that first estimates the 
input and then the FIR coefficients (for instance [2], [17], 
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Fig. 2. The true output of the FIR model identified in Section Vlll-A. 
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Fig. 3. The true input of the FIR model identified in Section VIII-A. 

[27]) works pretty well. In particular, the naive approach 
of first estimating a piecewise input by fitting a piecewise 
constant signal to the output measurements (use e.g., [22], 
[29]) and secondly estimate the FIR coefficients gave an as 
good result as BIL. 

B. Identifying an ARX Model From Noisy Data 

In this example we use the same input as in the previous 
example but modify the system to be 

z(t) =0.2z(t - 1) - 4.9594u(i - 1) 

+6.1774u(>- 2) + 3.3930w(t-3). (29) 

We also assume that there is a uniform measurement noise 
between —2 and 2 added to the output, 

y(t) = z(t) + e(t), e(i)~tf(-2,2). (30) 

Given {^(i)}^! we now aim to find a model of the form 

z(t) = aiz(t-l)+biu(t-l)+b 2 u(t-2)+b 3 u(t-3), (31) 

and a piecewise constant input {u(t)}f=i- The given output 
sequence {y{t)}t=i is shown in Figure 4. 





Fig. 4. The output measurements {y(t)}t=i given as filled circles and the 
estimated simulated output obtained by feeding the estimated ARX model 
with the estimated inputs depicted using solid line. 



Fig. 6. The true input shown with dashed line and the final estimate of 
the input, after the refinement step, shown with solid line. 



If we apply BIL with A = 10 7 and e = 2 the input shown 
with solid line in Figure 5 is found. The input associated 
with the second largest singular value is also shown (gray 
thin line). The two largest singular values were 43 and 15. 
Figure 5 also shows the true input with dashed line. Figure 4 
shows the output generated by driving the estimated ARX 
model with the input estimate (both corresponding to the 
largest singular value of X). 
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Fig. 5. The true input shown using dashed line, the estimate of the input 
associated with the first singular value of X with solid and the estimate 
associated with the second largest singular value shown with thin gray line. 

As in Lasso [33] and my other £i-regularization problems, 
it is useful with a refinement step to remove bias. Simply set 
A = in BIL and add the constraint 

Au(i) = if |Au*(i)|<7, i = 1, . . . , N - 1, (32) 

where Au* is the previous estimate of Au and 7 > 0. If 
we chose 7 = 0.5 the input shown in Figure 6 is the result. 
The corresponding output obtained by driving the estimated 
ARX model with the estimated input (both corresponding to 



the largest singular value of X after the refinement step) is 
given in Figure 7. The two first singular values were now 
44 and 5. The estimate for a was 0.2 and b\ = —4.5594, 
b 2 = 5.7741, and 63 = 4.0817. 




Fig. 7. The generated output obtained by feeding the estimated refined 
ARX model with the refined output estimates shown with solid line. The 
measured noisy outputs shown with solid circles. 

On this more challenging example, the naive method of 
first estimating a piecewise input and secondly estimate the 
ARX coefficients did not give a satisfying result. Figure 8 
shows the result of fitting a piecewise constant signal to the 
outputs and Figure 9 shows compares the true input with the 
estimated input for the naive method. 

C. A Real Data Example 

This example is motivated by energy disaggregation. The 
problem of disaggregation refers to the problem of decom- 
posing an aggregated signal into its sources. As an example, 
the aggregated signal could be the total energy consumed by 
a house. The sources would then be the energy consumed 
by different appliances, e.g., the toaster, HVAC, dishwasher 




Fig. 8. A piecewise constant signal fitted to the measured output. 
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Fig. 9. The true input (dashed line) and the estimated input (solid line) 
by the naive method. 
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Fig. 10. Measured and estimated power consumption of a toaster. 
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Fig. 11. Measured and estimated power consumption of a toaster. 



etc. In [14], we present a disaggregation algorithm which 
utilizes models for individual appliances. To model different 
appliances, the power of individual appliances was measured 
as they were turned on and off. Figures 10 and 11 show 
the measured power of a toaster as it was turned on at two 
different times. To estimate a model for the toaster, we need 
to estimate both the input and the model at the same time. In 
addition, we do not want to assume that the input is binary 
since many appliances have settings that may have changed 
from one time to the next, e.g., the temperature setting of a 
toaster etc. We make the assumption that a change in e.g., the 
temperature of the toaster can be modeled by different input 
amplitudes. It is therefore more natural to assume that the 
input is piecewise constant rather than binary. 

We chose to use n a = rib = 8, = e = 0.04 
and A = 10 8 . We subtracted the total mean of both power 
measurements and sought two input sequences and a set of 
parameters that well approximate the two power measure- 
ment sequences. This resulted in the input estimates shown 
in Figures 12 and 13. 



The ARX parameter were computed to: 



" 0.0191 " 




" 4.6219 " 


0.0004 




-0.0527 


-0.0006 




-0.0527 


0.0098 




-0.0527 


0.0053 


, b = 


-0.0567 


0.0065 




-0.0567 


0.0231 




-0.0567 


-0.0135 




-0.0683 



(33) 



Simulating the model provides the power estimates also 
shown in Figures 10 and 11. The two largest eigenvalues 
were 32 and 0.07. The found solution is hence very closet 
to being a rank 1 matrix. 

Given aggregated power measurements, we can now use 
the model of the toaster and seek the a piecewise constant 
signal representing the toaster being turned on and off. Since 
it is the power consumption of different devices that are of 
interest in disaggregation, it is not a problem that we can not 



identify the inputs or the ARX parameters more than up to 
a multiplicative constant. 
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Fig. 12. Estimated piecewise constant input to the toaster power measure- 
ments seen in Figure 10. 
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Fig. 13. Estimated piecewise constant input to the toaster power measure- 
ments seen in Figure 1 1 . 



IX. Conclusion 

This paper presented a novel framework for BSI of 
ARX model with piecewise constant inputs. The framework 
uses the fact that the problem can be rewritten as a rank 
minimization problem. A convex relaxation is presented to 
approximate the sought ARX parameters and the unknown 
inputs. 

References 

[1] K. Abed-Meraim, J.-F. Cardoso, A.Y. Gorokhov, P. Loubaton, and 
E. Moulines. On subspace methods for blind identification of single- 
input multiple-output FIR systems. IEEE Transactions on Signal 
Processing, 45(l):42-55, Jan. 

[2] K. Abed-Meraim, W. Qiu, and Y. Hua. Blind system identification. 
Proceedings of the IEEE, 85(8): 1310-1322, 1997. 

[3] A. Ahmed, B. Recht, and J. Romberg. Blind deconvolution using 
convex programming. CoRR, abs/121 1.5608, 2012. 



[4] R. Berinde, A. Gilbert, P. Indyk, H. Karloff, and M. Strauss. Combin- 
ing geometry and combinatorics: A unified approach to sparse signal 
recovery. In Communication, Control, and Computing, 2008 46th 
Annual Allerton Conference on, pages 798-805, September 2008. 

[5] D. P. Bertsekas and J. N. Tsitsiklis. Parallel and Distributed Compu- 
tation: Numerical Methods. Athena Scientific, 1997. 

[6] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein. Distributed op- 
timization and statistical learning via the alternating direction method 
of multipliers. Foundations and Trends in Machine Learning, 2011. 

[7] A. Bruckstein, D. Donoho, and M. Elad. From sparse solutions of 
systems of equations to sparse modeling of signals and images. SIAM 
Review, 51(1):34-81, 2009. 

[8] E. Candes. The restricted isometry property and its implications for 
compressed sensing. Comptes Rendus Mathematique, 346(9-10):589- 
592, 2008. 

[9] E. Candes, J. Romberg, and T. Tao. Robust uncertainty principles: 
Exact signal reconstruction from highly incomplete frequency informa- 
tion. IEEE Transactions on Information Theory, 52:489-509, February 
2006. 

[10] E. Candes, T. Strohmer, and V. Voroninski. PhaseLift: Exact and 
stable signal recovery from magnitude measurements via convex 
programming. Technical Report arXiv: 1 109.4499, Stanford University, 
September 2011. 

[11] E. J. Candes and Y. Plan. Tight oracle bounds for low-rank matrix 

recovery from a minimal number of random measurements. CoRR, 

abs/1001.0339, 2010. 
[12] J. N. Caron. Efficient blind deconvolution of audio-frequency signal. 

Journal of the Acoustical Society of America, 1 16(l):373-378, 2004. 
[13] A. Chai, M. Moscoso, and G. Papanicolaou. Array imaging using 

intensity-only measurements. Technical report, Stanford University, 

2010. 

[14] R. Dong, L. Ratliff, H. Ohlsson, and S. S. Shankar. A dynamical 
systems approach to energy disaggregation. In Proceedings of the 52th 
IEEE Conference on Decision and Control, Florence, Italy, December 
2013. Submitted to. 

[15] D. Donoho. Compressed sensing. IEEE Transactions on Information 
Theory, 52(4):1289-1306, April 2006. 

[16] M. Fazel, Ft. Hindi, and S. P. Boyd. A rank minimization heuristic with 
application to minimum order system approximation. In Proceedings 
of the 2001 American Control Conference, pages 4734-4739, 2001. 

[17] D. Gesbert, P. Duhamel, and S. Mayrargue. On-line blind multichannel 
equalization based on mutually referenced filters. IEEE Transactions 
on Signal Processing, 45(9):2307-2317, 1997. 

[18] M. X. Goemans and D. P. Williamson. Improved approximation algo- 
rithms for maximum cut and satisfiability problems using semidefinite 
programming. J. ACM, 42(6): 11 15-1 145, November 1995. 

[19] M. Grant and S. Boyd. Graph implementations for nonsmooth convex 
programs. In V. D. Blondel, S. Boyd, and H. Kimura, editors, Recent 
Advances in Learning and Control, Lecture Notes in Control and 
Information Sciences, pages 95-110. Springer- Verlag, 2008. http: 
/ /Stanford . edu/ ~boyd/graph_dcp . html. 

[20] M. Grant and S. Boyd. CVX: Matlab software for disciplined convex 
programming, version 1.21. http://cvxr.com/cvx, August 
2010. 

[21] Y. Hua. Fast maximum likelihood for blind identification of multiple 
FIR channels. IEEE Transactions on Signal Processing, 44(3):661- 
672, 1996. 

[22] S.-J. Kim, K. Koh, S. Boyd, and D. Gorinevsky. i\ trend filtering. 
SIAM Review, 51(2):339-360, 2009. 

[23] T.-H. Li. Blind deconvolution of discrete-valued signals. In Confer- 
ence Record of The Twenty-Seventh Asilomar Conference on Signals, 
Systems and Computers, pages 1240-1244. IEEE, 1993. 

[24] L. Ljung. System Identification — Theory for the User. Prentice-Hall, 
Upper Saddle River, N.J., 2nd edition, 1999. 

[25] J. Lofberg. YALMIP: A toolbox for modeling and optimization in 
MATLAB. In Proceedings of the CACSD Conference, Taipei, Taiwan, 
2004. 

[26] L. Lovasz and A. Schrijver. Cones of matrices and set-functions and 
0-1 optimization. SIAM Journal on Optimization, 1:166-190, 1991. 

[27] J. I. Makhoul. Linear prediction: A tutorial review. Proceedings of 
the IEEE, 63(4):561-580, April 1975. 

[28] Y. Nesterov. Semidefinite relaxation and nonconvex quadratic opti- 
mization. Optimization Methods & Software, 9:141-160, 1998. 



[29] H. Ohlsson, L. Ljung, and S. Boyd. Segmentation of ARX-models 
using sum-of-norms regularization. Automatica, 46(6): 1107-1 111, 
2010. 

[30] H. Ohlsson, A. Y. Yang, R. Dong, M. Verhaegen, and S. S. Sastry. 
Quadratic basis pursuit. CoRR, abs/1301.7002, 2013. 

[31] N.Z. Shor. Quadratic optimization problems. Soviet Journal of 
Computer and Systems Sciences, 25:1-11, 1987. 

[32] S. Talwar, M. Viberg, and A. Paulraj. Blind estimation of multiple co- 
channel digital signals using an antenna array. IEEE Signal Processing 
Letters, 1(2):29-31, 1994. 

[33] R. Tibsharani. Regression shrinkage and selection via the lasso. 
Journal of Royal Statistical Society B (Methodological), 58(1):267- 
288, 1996. 



